ABSTRACT


Machine learning (ML) has shown great potential in materials
science. It is widely used in material design, corrosion detection,
material screening, new material discovery, and other areas of the
field. The majority of ML approaches in materials science are based
on artificial neural networks (ANNs). The use of ML and related
techniques for materials design, development, and characterization
has matured into a mainstream field. This paper focuses on the
applications of machine learning strategies for material
characterization.
1. INTRODUCTION

The rapid growth of material data from experiments,
computations, and simulations is outpacing the capacity
to process it. Much of this volume comes from the
number of samples collected over time via experiments
or simulations. A large data volume (a.k.a. big data)
is an advantage, but coping with tremendous amounts
of data is also a challenge [1]. Machine learning
methods have become indispensable in view of this
growing quantity of material data. In the era of big
data, machine learning has become an integral part of
our daily lives.


Due to the high cost of traditional trial-and-error
methods in materials research, materials scientists
have relied on simulation and modeling methods to
understand and predict material properties.
Traditionally, the field of materials science has relied
on experiments and simulation-based tools for material
characterization, but simulations can be highly
demanding in terms of time and resources. Recently,
machine learning (ML) methods for property
prediction and material design have attracted a lot of
attention. ML algorithms have been successful in
expediting, enhancing, and complementing traditional
modeling and simulation capabilities [2]. Materials
engineers have leveraged machine learning tools to
advance characterization methods in materials science
and engineering.


2. OVERVIEW OF MACHINE LEARNING

Machine learning (ML), also called statistical learning,
is a branch of artificial intelligence concerned with
computer programs that improve their performance
through data and experience. It is the discipline
that gives computers the ability to learn without being
explicitly programmed. The term “machine learning”
was coined in 1959 by Arthur Samuel, a computer
scientist. ML helps computers estimate future events
and build models based on experience gained from
previous information. It allows computers to
learn from past examples and detect hard-to-discern
patterns in large data sets. More precisely, it describes
a class of algorithms that learn model parameters from
a set of training data with the purpose of accurately
predicting outcomes for previously unseen data. ML
is a marriage between statistics and computer science
[3, 4].


As shown in Figure 1, there are two main types of learning:
supervised learning and unsupervised learning [5].
Supervised learning focuses on classification and
prediction. It involves building a statistical model for
predicting or estimating an outcome based on one or
more inputs, using training data in which the outcome
is known. It is often used to estimate risk.
Learning from data is used when there is no
theoretical or prior-knowledge solution, but data is
available to construct an empirical solution. In
unsupervised learning, we are interested in finding
naturally occurring patterns within the data. Unlike
supervised learning, there is no outcome to predict;
unsupervised learning looks for internal structure in
the data. Unsupervised learning algorithms are
common in neural network models. A common
application of such a process is to explore
interrelationships between genetics, biochemistry,
histology, and disease states. In addition to supervised
and unsupervised learning, semi-supervised learning
methods are also growing in use.
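
The difference between the two types of learning can be made concrete with a minimal Python sketch. Everything in it is an assumption made for illustration: the one-dimensional "hardness" readings are made-up numbers, and the two helper functions (a nearest-neighbor classifier for the supervised case and a two-cluster k-means for the unsupervised case) are simple stand-ins, not methods taken from the cited works.

```python
# Made-up 1-D readings, for illustration only (not measured data).
readings = [1.0, 1.2, 0.9, 5.0, 5.3, 4.8]
# Labels are available only in the supervised setting.
labels = ["soft", "soft", "soft", "hard", "hard", "hard"]


def nearest_neighbor_predict(x, xs, ys):
    """Supervised: return the label of the closest labeled training example."""
    best = min(range(len(xs)), key=lambda i: abs(xs[i] - x))
    return ys[best]


def two_means(xs, iterations=10):
    """Unsupervised: group unlabeled values into two clusters (k-means, k=2).

    Assumes xs contains at least two distinct values.
    """
    c0, c1 = min(xs), max(xs)  # start the two centroids at the extremes
    for _ in range(iterations):
        g0 = [x for x in xs if abs(x - c0) <= abs(x - c1)]
        g1 = [x for x in xs if abs(x - c0) > abs(x - c1)]
        c0, c1 = sum(g0) / len(g0), sum(g1) / len(g1)
    return sorted([c0, c1])


# Supervised: a new reading receives one of the known labels.
prediction = nearest_neighbor_predict(4.6, readings, labels)

# Unsupervised: the same readings, without labels, fall into two groups.
centroids = two_means(readings)
```

The supervised function needs the labels to make a prediction, whereas the clustering function only sees the readings and recovers the two groups from their internal structure.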


Machine learning is a purely data-driven operation. In
materials science, its main objective is the prediction of
material properties. Machine learning
uses large amounts of data to continuously optimize
models and to make reasonable predictions.
Traditional machine learning methods (shallow
learning) require features to be selected manually.
They typically begin with raw data and end with a
predictive model that can be used to make decisions.
The process usually includes the following steps [6]:


1. Data Gathering to identify and collect input data. 


2. Data Cleansing to standardize and clean the raw 
inputs. 


3. Feature Processing to transform the input data
into formats that can be easily processed to 
identify the best predictor variables. 


4. Model Training to train the model, using a wide 
range of potential algorithms. 


5. Model Validation to test the model against
historical data and assess its performance. 


6. Model Deployment to load the model into an 
environment where it can make decisions. 
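
The six steps above can be sketched end to end in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure of [6]: the data are synthetic (a noisy straight line with some records deliberately left incomplete), the model is an ordinary least-squares line, and all function names are hypothetical.

```python
import math
import random


def gather(n=60, seed=0):
    """Step 1, Data Gathering: synthetic (feature, target) records."""
    rng = random.Random(seed)
    rows = []
    for i in range(n):
        x = rng.uniform(0.0, 10.0)
        y = 3.0 * x + 2.0 + rng.gauss(0.0, 0.1)
        # Every 10th record loses its feature to mimic a failed measurement.
        rows.append((None if i % 10 == 0 else x, y))
    return rows


def cleanse(rows):
    """Step 2, Data Cleansing: drop records with missing values."""
    return [(x, y) for x, y in rows if x is not None]


def make_scaler(xs):
    """Step 3, Feature Processing: standardize to zero mean, unit variance."""
    mean = sum(xs) / len(xs)
    std = math.sqrt(sum((x - mean) ** 2 for x in xs) / len(xs))
    return lambda x: (x - mean) / std


def train(zs, ys):
    """Step 4, Model Training: fit y = a*z + b by ordinary least squares."""
    zm, ym = sum(zs) / len(zs), sum(ys) / len(ys)
    a = sum((z - zm) * (y - ym) for z, y in zip(zs, ys)) / sum(
        (z - zm) ** 2 for z in zs
    )
    return a, ym - a * zm


def validate(model, zs, ys):
    """Step 5, Model Validation: root-mean-square error on held-out data."""
    a, b = model
    return math.sqrt(sum((a * z + b - y) ** 2 for z, y in zip(zs, ys)) / len(ys))


def deploy(model, scale):
    """Step 6, Model Deployment: wrap scaling and model in one callable."""
    a, b = model
    return lambda x: a * scale(x) + b


rows = cleanse(gather())
split = int(0.8 * len(rows))  # first 80% for training, the rest held out
train_rows, test_rows = rows[:split], rows[split:]
scale = make_scaler([x for x, _ in train_rows])
model = train([scale(x) for x, _ in train_rows], [y for _, y in train_rows])
rmse = validate(model, [scale(x) for x, _ in test_rows], [y for _, y in test_rows])
predict = deploy(model, scale)
```

The scaler is built from the training records only, so the held-out records play the role of data the model has not seen before validation.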


The applications of machine learning are wide-ranging,
including medicine, machine perception, computer
vision, object recognition, natural language
processing, cheminformatics, fraud detection, stock
market analysis, games, robotics, and health monitoring.
Industry leaders such as Google, Amazon, and
Microsoft now offer numerous tools to help
beginners get started with building their own machine
learning systems.


3. CHARACTERIZATION OF MATERIALS

Characterization is a fundamental process in materials 
science. It refers to the general process by which a 
material's structure and properties are probed and 
measured [7]. Material characteristics such as 
strength, toughness, hardness, brittleness, or ductility 
are useful for categorizing a material. In materials 
science, researchers rely on experiments and 
simulation for material characterization. 


Traditional material tests such as tensile tests,
compression tests, or creep tests are often
time-consuming and expensive to perform. Tensile
properties indicate how the material will react to
forces applied in tension. Typical destructive
tests include the bend test, impact test, hardness test,
tensile test, fatigue test, corrosion resistance test, and
wear test.


In the past, characterization of the material, raw or
processed, was done mainly by laboratory
experiments, which can become costly. In recent
times, computational models and high-fidelity
simulations have made the process more efficient [8].


Sometimes, statistical machine learning tools are
trained on the available experimental results and then
used in place of the real experiments. ML approaches
are considered helpful because they make material
property information easier to generate: models
trained on experimental datasets accelerate the
characterization of materials. The
majority of early ML applications to materials science
employed simple-to-use algorithms, like linear kernel
models and decision trees. Solving materials
problems with ML requires datasets from which the
target features can be detected.
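
As an illustration of a model trained on experimental results and then queried in place of a new experiment, the following Python sketch fits the simplest possible decision tree, a single split with one mean value on each side. It is a sketch under loud assumptions: the composition and hardness values are made-up numbers, not measurements, and `fit_stump` and `predict_stump` are hypothetical helper names.

```python
def fit_stump(xs, ys):
    """Fit a depth-1 regression tree: one threshold, one mean per side."""
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold fits between identical feature values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        # Sum of squared errors if each side predicts its own mean.
        sse = sum((y - left_mean) ** 2 for y in left) + sum(
            (y - right_mean) ** 2 for y in right
        )
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return threshold, left_mean, right_mean


def predict_stump(stump, x):
    """Return the mean of the side of the split that x falls on."""
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


# Made-up "experimental" data: a composition variable and a measured hardness.
composition = [0.10, 0.15, 0.20, 0.60, 0.70, 0.80]
hardness = [120.0, 125.0, 130.0, 250.0, 260.0, 270.0]

stump = fit_stump(composition, hardness)

# The trained model is queried instead of running a new test.
estimate = predict_stump(stump, 0.65)
```

A single split is far too coarse for real characterization work; practical decision trees apply the same split search recursively and need enough experimental data to justify each split.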


The guiding ideology of materials science can be
summarized in four paradigms. The first paradigm is
the empirical trial-and-error method. The second
paradigm is physical and chemical laws. The third
paradigm is computer simulation, and the fourth
paradigm is big data-driven science. The fourth
paradigm can unify the other three
in the aspects of theory, experiment, and
computer simulation [9]. Figure 2 shows an
application of machine learning in materials science
[10].


Materials scientists have demonstrated success in
utilizing ML-based methods in two major categories:
(1) to accelerate the prediction of material properties
for specific applications and (2) to accelerate the
on-demand design and optimization of material
composition. ML techniques have been shown to be
more effective than traditional simulation or
experimental approaches [11]. Machine learning has
been applied in studying the properties of inorganic
materials. For metallic biomaterials, an attractive
application of ML in recent years is in medical
implants. Machine learning is expected to be applied
increasingly in bio-related materials science [12].


4. BENEFITS 

Machine learning (ML) is widely used in several
aspects of materials science such as new material
discovery, detection of material properties, material
prediction, material design, inverse design, corrosion
detection, material screening, and data preprocessing
(data collecting and data cleaning) [13]. ML can be
used to predict suitable new compounds. A major
advantage of applying ML approaches is that it is not
necessary to postulate a mathematical model first.
Machine learning allows one to replace the traditional
tests with simple and fast tests. ML techniques have
been utilized to predict material fatigue life for steel.
They have become efficient tools for analyzing
materials in the new field of material discovery, where
they support data preprocessing, feature engineering,
and the choice of learning algorithms. They have also
been employed along with physics-based simulations to
combine information from different sources.
Experiment- and simulation-based data mining in
combination with ML tools provides great opportunities
to identify fundamental interrelations within
materials for characterization. The development of
deep learning has brought new progress in
materials science applications. Deep learning has
high potential in the inverse design of materials.
Needless to say, we are only scratching the surface
of what is possible with machine learning.


5. CHALLENGES 

Implementing ML techniques in materials science
remains challenging. At present, the
ML approach is confronted with several problems:
messy datasets must be preprocessed; the accuracy of
the model is limited by its algorithms; and
high-intensity computation places pressure on
computing resources. Complex ML algorithms are often
treated as black boxes, so their use yields little new
understanding or knowledge. Material characterization
often suffers from the problem of small data,
because generating data by experiment or simulation
is complex and expensive. For some
applications, there is a lack of benchmarking datasets
and standards.


6. CONCLUSION 

The commonly used traditional trial-and-error
approaches rely on personal experience and are poorly
suited to new materials because of their long development
cycles, low efficiency, and high costs. Recently, using
machine learning to explore new materials has
become popular. Various machine learning methods
have been used successfully for the prediction of
material properties. More information on materials
characterization using machine learning can be found
in the books in [14] and the following related journals:

> Materials Characterization

> International Journal on Materials Structure and
Behavior

> Machine Learning: Science and Technology

Figure 2 Machine learning in materials science